Using extended phylogenetic profiles and support vector machines for protein family classification
Authors
Abstract
We proposed a new approach to comparing profiles when the correlations among attributes can be represented as a tree. To account for these correlations, each profile is extended with new bits that correspond to the internal nodes of the tree and encode the correlations. An ad hoc scoring scheme measures the similarity between these extended profiles, and the resulting scores are passed to a classifier (a support vector machine with a polynomial kernel function) for classification. The effectiveness of the proposed scoring scheme is assessed by the improvement in the classifier's accuracy. As an application, the method is used to classify proteins into their functional families based on their phylogenetic profiles. The performance is shown to be much better than that obtained with simple Hamming distances, and also better than that of a Bayesian tree kernel.
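The idea of extending a profile with internal-node bits can be illustrated with a minimal Python sketch. The abstract does not say how the internal-node bits are computed or what the ad hoc scoring scheme is, so everything here is an assumption made for illustration: the tree is written as nested tuples of leaf indices, an internal node's bit is taken to be the logical OR of its children, and plain Hamming distance on the extended profile stands in for the scoring scheme. The names `extend_profile` and `hamming` are hypothetical, and the classifier stage is omitted.

```python
from typing import List, Sequence, Tuple, Union

# Assumed tree encoding: a leaf is an index into the profile,
# an internal node is a tuple of child subtrees.
Tree = Union[int, Tuple["Tree", ...]]


def extend_profile(profile: Sequence[int], tree: Tree) -> List[int]:
    """Return the leaf bits followed by one bit per internal node.

    Internal-node bits are appended in post-order. Each one is the OR of
    its children's bits (an illustrative rule, not the paper's).
    """
    internal_bits: List[int] = []

    def visit(node: Tree) -> int:
        if isinstance(node, int):
            return profile[node]
        child_bits = [visit(child) for child in node]
        bit = int(any(child_bits))
        internal_bits.append(bit)
        return bit

    visit(tree)
    return list(profile) + internal_bits


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# Four genomes grouped into two clades: (0, 1) and (2, 3).
tree: Tree = ((0, 1), (2, 3))
r = [1, 0, 0, 0]
s = [0, 1, 0, 0]  # differs from r inside the same clade
t = [0, 0, 1, 0]  # differs from r across clades

print(hamming(r, s), hamming(r, t))  # 2 2
print(hamming(extend_profile(r, tree), extend_profile(s, tree)))  # 2
print(hamming(extend_profile(r, tree), extend_profile(t, tree)))  # 4
```

On the raw profiles, `s` and `t` are equally distant from `r`. After extension, the pair that differs across clades is scored as further apart, which is the kind of tree-induced correlation the extra bits are meant to capture.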
Similar resources
Use of Extended Phylogenetic Profiles with E-Values and Support Vector Machines for Protein Family Classification
Protein family classification is an important means to assign functions to proteins, and use of phylogenetic profiles, which encode evolutionary history of proteins along with putative homologs, has proved to facilitate protein family classification. We proposed a new approach to compare phylogenetic profiles by incorporating the phylogenetic tree, from which the profiles are derived. Specifica...
Face Recognition using Eigenfaces, PCA and Support Vector Machines
This paper is based on a combination of principal component analysis (PCA), eigenfaces and support vector machines. Using the N-fold method and with respect to the value of N, each person's face images are divided into two sections. As a result, vectors of training features and test features are obtained. Classification precision and accuracy were examined with three different types of kernel and...
Fault diagnosis in a distillation column using a support vector machine based classifier
Fault diagnosis has always been an essential aspect of control system design, a need driven by the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...
A comparative study of performance of K-nearest neighbors and support vector machines for classification of groundwater
The aim of this work is to examine the feasibilities of the support vector machines (SVMs) and K-nearest neighbor (K-NN) classifier methods for the classification of an aquifer in the Khuzestan Province, Iran. For this purpose, 17 groundwater quality variables including EC, TDS, turbidity, pH, total hardness, Ca, Mg, total alkalinity, sulfate, nitrate, nitrite, fluoride, phosphate, Fe, Mn, Cu, ...
A QUADRATIC MARGIN-BASED MODEL FOR WEIGHTING FUZZY CLASSIFICATION RULES INSPIRED BY SUPPORT VECTOR MACHINES
Recently, tuning the weights of the rules in Fuzzy Rule-Base Classification Systems has been researched in order to improve classification accuracy. In this paper, a margin-based optimization model, inspired by Support Vector Machine classifiers, is proposed to compute these fuzzy rule weights. This approach not only considers both accuracy and generalization criteria in a single objective fu...